perm filename 0[0,BGB]5 blob sn#059728 filedate 1973-08-30 generic text, type T, neo UTF8
DRAFT THESIS OUTLINE.					DECEMBER 1972

                          GEOMETRIC VISION
                      - draft thesis outline -

                           B. G. Baumgart


ABSTRACT:

	This thesis  is about  a computer  vision system  based on  a
geometric  model  of the  objects  being viewed.  In  principle, this
vision system is simply  a process that can be  applied to a reel  of
video tape to  compute blueprints or  geodetic  maps. Applications of
this  system to object recognition, scene  analysis and robot vehicle
control are demonstrated.


CONTENTS:

	I. MEMORY.

	   A. 	Representation of a Geometric Mental Universe.
	   B.	Contour-Region-Edge Image Representation.
	   C.	Semantic, Feature and Predicate Representation.

	II. PROCESS.

	   A.	Image Prediction.
	   B.	Image Perception.
	   C.	Image Comparison.
	   D.	Camera Locus Solution.
	   E.	World Model Modification.
		   1.	delete object from map.
		   2.	add known object to map. (recognition).
		   3. 	add or alter object in dictionary.

	III. APPLICATION.

	   A.	Blocks and Block Scenes.
		   1. deletion of a block from a scene.
		   2. addition of blocks to a scene.
	   B.	Tools and Table Top Scenes.
		   1. complicated object perception.
		   2. known object recognition.
	   C.	A Robot Vehicle and Outdoor Scenes.
		   1. known road servoing.
		   2. landscape perception.
I. MEMORY STRUCTURE.

	In order to get a computer to deal with the physical world it
must  have  a  data  representation  on  which computations involving
space, time, shape, size and the appearance of things can be done. In
this  section,  a  representation  for  the  topology,  geometry  and
photometry of everyday things is explained. The data structures
discussed are implemented as small blocks of words containing
pointers and data, in the fashion usual to graphics and simulation;
an introduction to this technology can be found in Knuth [1].
Although the language of implementation is PDP-10 machine code, the
data and functions presented below are accessible from higher level
languages like LISP and ALGOL.
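
	The idea of nodes as small blocks of words holding pointers
and data can be illustrated with a minimal sketch. It is written in
Python rather than PDP-10 machine code, and the block size, the word
offsets (KIND, DATA, LINK) and the class name NodeStore are all
assumptions made for illustration; none of them are taken from the
actual implementation.

```python
NIL = 0                      # address 0 is reserved to mean "no node"
BLOCK_SIZE = 3               # hypothetical: words per node block
KIND, DATA, LINK = 0, 1, 2   # hypothetical word offsets within a block


class NodeStore:
    """A flat array of integer 'words' carved into fixed-size node blocks."""

    def __init__(self, words: int = 32) -> None:
        self.memory = [0] * words
        self.next_free = BLOCK_SIZE   # block 0 is left unused so NIL is never a node

    def make_node(self, kind: int, data: int, link: int = NIL) -> int:
        """Allocate one block and return its address, which serves as the pointer."""
        addr = self.next_free
        if addr + BLOCK_SIZE > len(self.memory):
            raise MemoryError("out of node space")
        self.next_free += BLOCK_SIZE
        self.memory[addr + KIND] = kind
        self.memory[addr + DATA] = data
        self.memory[addr + LINK] = link
        return addr

    def walk(self, addr: int) -> list:
        """Follow LINK pointers from addr, collecting each node's data word."""
        out = []
        while addr != NIL:
            out.append(self.memory[addr + DATA])
            addr = self.memory[addr + LINK]
        return out


store = NodeStore()
tail = store.make_node(kind=1, data=20)
head = store.make_node(kind=1, data=10, link=tail)
chain = store.walk(head)     # data words reached by following the links
```

Because a pointer here is only an integer address into the array, the
same blocks could in principle be read from any language that can
index that memory, which is the sense in which the structures are
accessible from higher level languages.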

I.A. Representation of a Geometric Mental Universe.

	At the top of the data structure is a  single  universe  node
from  which  everything  else can be reached.   Immediately below the
universe node is a ring  of  world  models.   A  robot  dealing  with
physical world sensor input, such as video data, has one of its world
models dedicated to simulating  the  immediate  here  and  now;  this
mental  world  is  called the reality world model. In addition to the
reality world, a robot may have  fantasy  world  models  for  problem
solving, planning or for recalling platonic object prototypes. In the
following, a two-world mental universe will be the most common,  with
the  reality world being referred to as a "map" and the fantasy world
being referred to as a "dictionary".
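
	The universe node and its ring of world models might be
sketched as follows. This is an illustration in Python under stated
assumptions, not the implementation itself: the class names Universe
and World, the field names, and the choice of a singly linked
circular ring are all hypothetical.

```python
from typing import Iterator, Optional


class World:
    """One world model; `next` links the worlds into a circular ring."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.next: Optional["World"] = None


class Universe:
    """The single root node from which every world model can be reached."""

    def __init__(self) -> None:
        self.first: Optional[World] = None

    def add_world(self, name: str) -> World:
        world = World(name)
        if self.first is None:
            world.next = world            # a ring of one points at itself
            self.first = world
        else:
            # splice the new world into the ring just after the first one
            world.next = self.first.next
            self.first.next = world
        return world

    def worlds(self) -> Iterator[World]:
        """Walk the ring exactly once, starting from the first world."""
        node = self.first
        while node is not None:
            yield node
            node = node.next
            if node is self.first:
                break


universe = Universe()
reality = universe.add_world("map")          # the reality world model
fantasy = universe.add_world("dictionary")   # a fantasy world model
names = [w.name for w in universe.worlds()]
```

In this two-world arrangement each world's `next` field points at the
other, so either world can be reached from the universe node by
walking the ring.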

	Geometric world models have four  basic  kinds  of  nodes:
body, face, edge and vertex. The face, edge and vertex nodes are used
to form polyhedrons which may be attached to body nodes.  Body  nodes
in  turn  are  connected  to  each other in rings and trees to form a
world model. Additional kinds of nodes  describe  cameras  and  light
sources  as  well  as  temporary  data  such  as shadows, spines, and
trajectories.
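
	A minimal sketch of the four node kinds is given below, again
in Python and under stated assumptions. It is not the winged edge
representation of AIM-179: edges here simply name their two
vertices, faces list their bounding edges, and the rings of bodies
are simplified to a parent pointer and a list of parts. All class and
field names are hypothetical. A tetrahedron is built as the example
polyhedron, and Euler's formula V - E + F = 2 serves as a check.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Vertex:
    xyz: Tuple[float, float, float]


@dataclass(eq=False)
class Edge:
    v1: Vertex
    v2: Vertex


@dataclass(eq=False)
class Face:
    edges: List[Edge]


@dataclass(eq=False)
class Body:
    name: str
    vertices: List[Vertex] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)
    faces: List[Face] = field(default_factory=list)
    parent: Optional["Body"] = None                    # tree link upward
    parts: List["Body"] = field(default_factory=list)  # tree links downward

    def attach(self, part: "Body") -> None:
        """Make `part` a sub-body of this body."""
        part.parent = self
        self.parts.append(part)


def tetrahedron(name: str) -> Body:
    """Build a body carrying a tetrahedron: 4 vertices, 6 edges, 4 faces."""
    body = Body(name)
    corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    body.vertices = [Vertex(p) for p in corners]
    # every pair of vertices is joined by an edge
    body.edges = [Edge(a, b) for a, b in combinations(body.vertices, 2)]
    # each face is bounded by the three edges that avoid one vertex
    for omitted in body.vertices:
        bounding = [e for e in body.edges if omitted not in (e.v1, e.v2)]
        body.faces.append(Face(bounding))
    return body


table = Body("table")            # a body with no polyhedron of its own
block = tetrahedron("block")
table.attach(block)
euler = len(block.vertices) - len(block.edges) + len(block.faces)
```

For the tetrahedron the count is 4 - 6 + 4 = 2, as expected for a
simple polyhedron.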

	...continuation of this section follows AIM-179,
	"Winged Edge Polyhedron Representation" - Baumgart.
I.B. Region-Edge Image Representation.

	The image data structure  presented  in  this  section  is  a
computer's  internal  notation  for  what  is  vulgarly called a line
drawing; the common term is misleading because it  does  not  suggest
the  equally  important  space between the lines; terms closer to the
idea would be "mosaic drawing" or "stained glass window drawing".

	The data structure has four main levels: TV raster, video
intensity contour, arc contour, and region-edge.
	...continuation of this section follows SAILON-71,


II. PROCESS.

   A.	Image Prediction.
   B.	Image Perception.
   C.	Image Comparison.
   D.	Camera Locus Solution.
   E.	World Model Modification.
	   1.	delete object from map.
	   2.	add known object to map. (recognition).
	   3. 	add or alter object in dictionary.

III. APPLICATION.

   A.	Block Scenes.
	   1. deletion of a block from a scene.
	   2. addition of blocks to a scene.
   B.	Tools and things.
	   1. complicated object perception.
	   2. known object recognition.
   C.	Robot Vehicle.
	   1. known road servoing.
	   2. landscape perception.